Abstract

We study the distributions of event-time returns and clock-time returns at different
microscopic timescales using ultra-high-frequency data extracted from the limit-order
books of 23 stocks traded in the Chinese stock market in 2003. We find that the returns
at the one-trade timescale obey the inverse cubic law. For larger timescales (2-32 trades
and 1-5 minutes), the returns follow the Student distribution with power-law tails. As
the timescale decreases, the tail becomes fatter, which is consistent with the
variational theory.
1 Introduction



The distribution of asset price fluctuations has crucial implications for asset pricing
and risk management [1, 2, 3]. In their seminal paper on option pricing, Black and
Scholes assume that asset prices follow geometric Brownian motion, that is, the
returns are normally distributed [4]. It is well known that, for most financial assets,
this assumption is merely a crude approximation that holds at large time scales. In
addition, Fama finds that portfolio selection in a stable Paretian market differs from
that in a Gaussian market [5]. It is natural that the distribution of returns remains
a hot topic, especially now that huge databases recording transaction-level time series
of stocks have become available, which enable us to test classic theories and models
in finance, such as the mixture of distributions hypothesis [6].



The modeling of the distribution of price variations in financial markets can be
traced back to the work of Bachelier in 1900 [7]. Let $S(t)$ denote the price of a
security at time $t$. Bachelier submits that the price variation



$$\Delta S(t) = S(t) - S(t - \Delta t) \tag{1}$$



is an i.i.d. variable and follows a Gaussian distribution with zero mean. As pointed
out by Mandelbrot, an implicit assumption of Bachelier's model is that the variance
of $\Delta S(t)$ is independent of the price level $S(t)$ per se, which however
contradicts empirical findings. Nowadays, the logarithmic return is usually used instead,



$$r(t) = \ln S(t) - \ln S(t - \Delta t), \tag{2}$$



which is a close approximation of the price growth rate [9]. In addition, investors
are more sensitive to relative price changes than to absolute changes, according
to the Weber-Fechner law [10]. Quite a few scholars have found evidence supporting
the Brownian motion model [10, 11, 12]. This model is also called the Bachelier-Osborne
model, since Osborne independently rediscovered it.
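
The contrast between the price variation of Eq. (1) and the logarithmic return of Eq. (2) can be illustrated with a minimal Python sketch. The price lists are hypothetical numbers chosen for illustration, and the function names are not taken from the original work.

```python
import math

def price_changes(prices, lag=1):
    """Price variations as in Eq. (1): S(t) - S(t - lag)."""
    return [prices[k] - prices[k - lag] for k in range(lag, len(prices))]

def log_returns(prices, lag=1):
    """Logarithmic returns as in Eq. (2): ln S(t) - ln S(t - lag)."""
    return [math.log(prices[k]) - math.log(prices[k - lag])
            for k in range(lag, len(prices))]

def growth_rates(prices, lag=1):
    """Relative price changes S(t) / S(t - lag) - 1."""
    return [prices[k] / prices[k - lag] - 1.0 for k in range(lag, len(prices))]

# Hypothetical prices: the same 1% moves at two different price levels.
low = [10.0, 10.1, 10.0]
high = [100.0, 101.0, 100.0]

# Price variations scale with the price level ...
print(price_changes(low), price_changes(high))
# ... while logarithmic returns do not, and stay close to the growth rates.
print(log_returns(low), log_returns(high), growth_rates(low))
```

In this toy example the price variations differ by a factor of ten between the two series, whereas the logarithmic returns coincide, which is the property that motivates Eq. (2).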



More than half a century after Bachelier's work, a revolutionary breakthrough was
made by Mandelbrot, who introduced the Pareto-Lévy distribution to describe the
tails of incomes and speculative price returns [13, 14, 15, 16, 17]. The concept of
a Paretian market was soon accepted by mainstream financial scholars [5]. Using
high-frequency data of the S&P 500 index, Mantegna and Stanley find that the
distribution of returns can be well characterized by a truncated Lévy law [18].
Mathematically, the density of the Lévy distribution has a power-law decay in the tail,



$$f(r) \sim |r|^{-(\alpha+1)}, \tag{3}$$



where $0 < \alpha < 2$. Pareto finds that the income distribution has a universal
power-law exponent $\alpha = 1.5$ [19], while Mandelbrot finds that the price fluctuations
of cotton give $\alpha \approx 1.7$. For the S&P 500 index, it is found that $\alpha = 1.4$ [18].



In recent years, new evidence provided by Stanley's group shows that the tail
distributions of many stock indexes and stock prices in the USA markets exhibit an
inverse cubic law [20, 21, 22], where the power-law exponent is found to be close to
$\alpha = 3$. In contrast, empirical analyses of other stock markets have unveiled
power-law tail exponents outside both the Lévy regime and the inverse cubic law. Makowiec
and Gnaciński have studied the daily WIG index (the main index of the Warsaw Stock
Exchange in Poland) for five years and found that the distribution of returns follows
power-law behaviors in three parts, with $\alpha$ equal to 0.76, 2.03 and 3.88 for the
positive tail and 0.69, 1.83 and 3.06 for the negative tail [23]. Bertram focuses on the
high-frequency data of the 200 most actively traded stocks on the Australian Stock
Exchange in the period from 1993 to 2002, and reports that the distribution of returns
has power-law tails with $\alpha > 3$, which varies with the time interval $\Delta t$ from
10 to 60 minutes [24]. Coronel-Brizio et al. analyze the daily data (1990-2004) of
the Mexican stock market index (IPC) and find that, for a suitable cutoff value, the
daily returns follow a power-law distribution with exponents $\alpha^+ = 3.33$ (positive
tail) and $\alpha^- = 3.12$ (negative tail) [25]. Yan et al. investigate the daily returns
of 104 stocks (76 from the Shanghai Stock Exchange and 28 from the Shenzhen Stock
Exchange) in the Chinese stock markets in the period from 1994 to 2001 and argue that
the tail exponent is $\alpha^+ = 2.44$ for the positive part and $\alpha^- = 4.29$ for the
negative part [26]. After removing the opening and close returns of high-frequency data
for the Shanghai Stock Exchange Composite index, the tail exponents are much closer
to $\alpha = 3$ [27].



There are also controversial results for some markets. An example comes from
the Indian stock market. Matia et al. analyze the daily returns of the 49 largest stocks
on the National Stock Exchange over 8 years (1994-2002) and find that the distribution
of daily returns deviates significantly from the power-law form and instead decays
exponentially in the form $P(r) \sim e^{-\beta r}$, with the decay coefficient $\beta = 1.34$
for the positive tail and $\beta = 1.51$ for the negative tail [28]. In contrast, Pan and
Sinha have studied the daily data of two stock indices (Nifty, 1990-2006 and
Sensex, 1991-2006) and found that the daily returns are exponentially distributed,
followed by power-law decay in the tails ($\alpha^+ = 3.10$ and $\alpha^- = 3.18$ for Nifty and
$\alpha^+ = 3.33$ and $\alpha^- = 3.45$ for Sensex) [29]. They also analyze the high-frequency
data of 489 stocks containing the information about all the transactions carried out
on the National Stock Exchange (NSE) for a two-year period (2003-2004) and observe
power-law tails with $\alpha^+ = 2.87$ and $\alpha^- = 2.52$ for $\Delta t = 5$ and $\alpha \approx 3$
for $\Delta t$ ranging from 10 to 60 minutes [30].



An alternative model for the distribution of returns is the stretched exponential
family, which serves as a bridge between exponential and power-law distributions
[31, 32]:

$$f(r) = \frac{c}{r_0}\left(\frac{r}{r_0}\right)^{c-1} e^{-(r/r_0)^c} \quad (r \geq 0), \tag{4}$$

where the distribution approaches the exponential when $c \to 1$. The stretched
exponential model has a very interesting behavior when $c \to 0$. If $c\,(r/r_0)^c \to \beta$
as $c \to 0$, then the stretched exponential density goes to a power law [33, 34]:



$$f(r) \sim r^{-(\beta+1)}. \tag{5}$$
This framework is well verified by empirical analyses [33, 34].



A closely related issue concerns the time scale $\Delta t$ used to define the return.
Roughly speaking, the tail distribution evolves from a power law at small time scales
to a Gaussian at large time scales [35], based on the variational theory in turbulence
[36, 37, 38, 39]. Numerous empirical studies have been performed on various stock prices
and indexes, such as the S&P 500 index [21, 40, 41], the U.S.A. common stocks [22],
the Hang Seng Index for the Hong Kong market [42], and the KOSPI index and
KOSDAQ for the Korean market [43].

In this work, we utilize a database documenting the limit order flow and individual
transactions of 23 Chinese stocks traded on the Shenzhen Stock Exchange
(SZSE). For the Chinese stock market, only very few efforts have been made to
investigate the return distributions [26, 27]. To the best of our knowledge, there is no
literature reporting relevant results at the transaction level. Our main finding is that
the stock returns obey the inverse cubic law at the transaction level and have thinner
power-law tails at aggregated timescales. The rest of the paper is organized as follows.
In Sec. 2, we briefly describe the database we use. We investigate the probability
distribution of the event-time returns based on individual trades in Sec. 3.1 and
trade-aggregated returns in Sec. 3.2. The probability distribution of the returns on
fixed intervals of clock time is discussed in Sec. 3.3. The last section concludes.



2 Data sets 



The study is based on the data of the limit-order books of 23 liquid stocks listed 
on the SZSE in the whole year 2003. The limit-order book records ultra-high- 
frequency data whose time stamps are accurate to 0.01 second including details of 
every event. The tickers of the 23 stocks investigated are the following: 000001 
(Shenzhen Development Bank Co. Ltd: 887,741 trades), 000002 (China Vanke 
Co. Ltd: 509,360 trades), 000009 (China Baoan Group Co. Ltd: 447,660 trades), 
000012 (CSG Holding Co. Ltd: 290,148 trades), 000016 (Konka Group Co. Ltd:
188,526 trades), 000021 (Shenzhen Kaifa Technology Co. Ltd: 411,326 trades), 
000024 (China Merchants Property Development Co. Ltd: 133,586 trades), 000027 
(Shenzhen Energy Investment Co. Ltd: 313,057 trades), 000063 (ZTE Corporation:
265,450 trades), 000066 (Great Wall Technology Co. Ltd: 277,262 trades),
000088 (Shenzhen Yan Tian Port Holdings Co. Ltd: 97,195 trades), 000089 (Shen- 
zhen Airport Co. Ltd: 189,117 trades), 000406 (Sinopec Shengli Oil Field Dy- 
namic Group Co. Ltd: 271,389 trades), 000429 (Jiangxi Ganyue Expressway Co. 
Ltd: 117,424 trades), 000488 (Shandong Chenming Paper Group Co. Ltd: 120,097
trades), 000539 (Guangdong Electric Power Development Co. Ltd: 114,721 trades),
000541 (Foshan Electrical and Lighting Co. Ltd: 68,737 trades), 000550 (Jian- 
gling Motors Co. Ltd: 346,176 trades), 000581 (Weifu High-Technology Co. Ltd: 
93,947 trades), 000625 (Chongqing Changan Automobile Co. Ltd: 397,393 trades),
000709 (Tangshan Iron and Steel Co. Ltd: 207,756 trades), 000720 (Shandong
Luneng Taishan Cable Co. Ltd: 132,233 trades), and 000778 (Xinxing Ductile Iron 
Pipes Co. Ltd: 157,321 trades). 

In 2003, there are two kinds of auctions on the SZSE, namely the call auction and
the continuous double auction. The former refers to the process of one-time centralized
matching of buy and sell orders accepted during a specified period, while the
latter refers to the process of continuous matching of buy and sell orders on a
one-by-one basis. On each trading day, the opening call auction is held between 9:15 a.m.
and 9:30 a.m., followed by the continuous auction (9:30 a.m.-11:30 a.m. and
1:00 p.m.-3:00 p.m.). The orders that are not executed during the opening call auction
automatically enter the continuous auction. The limit orders submitted and canceled are
identified by numbers characterizing the aggressiveness and direction (buyer- versus
seller-initiated) of orders. Specifically, the buyer-initiated (or seller-initiated) orders
are differentiated into six categories, from less aggressive to more aggressive:
canceled orders, orders inside the book, orders on the same best price, orders
inside the spread, filled orders, and unfilled orders [44, 45, 46]. More information
about the market can be found in Ref. [47].



3 Empirical distribution of returns 



3.1 Probability distributions of event-time returns 



We adopt the midprice of the best bid $b_i(t)$ and best ask $a_i(t)$ of stock $i$ as the
price at time $t$ after a transaction occurs:

$$S_i(t) = \frac{b_i(t) + a_i(t)}{2}, \tag{6}$$

where $t$ is the event time corresponding to single trades. Indeed, the concept of
event time was introduced more than forty years ago to study the distribution
of returns [48]. We then define the event-time return after $\Delta t$ trades for stock $i$
as the logarithmic midprice change:

$$r_i(t) = \ln[S_i(t)/S_i(t - \Delta t)]. \tag{7}$$

In this section, we focus on $\Delta t = 1$ trade. In order to treat all the returns for
different stocks as an ensemble, we deal with standardized returns

$$g_i(t) = \frac{r_i(t) - \mu_i}{\sigma_i}, \tag{8}$$

where $\mu_i$ and $\sigma_i$ are respectively the mean and standard deviation of returns for
stock $i$.
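
The construction of midprices, event-time returns and standardized returns described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing pipeline of the original study: the quote values and the stock labels are hypothetical, and the population standard deviation is used because the text does not say which estimator is applied.

```python
import math
from statistics import mean, pstdev

def midprices(best_bids, best_asks):
    """Midprice after each trade: (best bid + best ask) / 2."""
    return [(b + a) / 2.0 for b, a in zip(best_bids, best_asks)]

def event_time_returns(mid, n_trades=1):
    """Logarithmic midprice change over n_trades consecutive trades."""
    return [math.log(mid[k] / mid[k - n_trades])
            for k in range(n_trades, len(mid))]

def standardize(returns):
    """Subtract the mean and divide by the standard deviation of one stock.

    Assumes the returns are not all identical (non-zero standard deviation).
    """
    mu = mean(returns)
    sigma = pstdev(returns)
    return [(r - mu) / sigma for r in returns]

def pooled_ensemble(returns_by_stock):
    """Standardize each stock separately, then pool all stocks together."""
    pooled = []
    for returns in returns_by_stock.values():
        pooled.extend(standardize(returns))
    return pooled

# Hypothetical best quotes recorded right after five consecutive trades.
bids = [10.00, 10.01, 10.01, 10.02, 10.00]
asks = [10.02, 10.02, 10.03, 10.03, 10.02]
r = event_time_returns(midprices(bids, asks), n_trades=1)
g = pooled_ensemble({"stock_A": r, "stock_B": [0.01, -0.02, 0.03, 0.0]})
print(len(g), mean(g))
```

Standardizing each stock before pooling removes differences in volatility between stocks, which is what allows the 23 series to be treated as a single ensemble.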
In our analysis, we treat the 23 groups of standardized returns $g_i(t)$ as an ensemble.
The empirical probability density function $f(g)$ is estimated, as shown on the left
panel of Fig. 1. We find that $f(g)$ can be well modeled by a Student density [49]:



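$$f(g \mid \alpha, m, L) = \frac{\sqrt{L}\,\alpha^{\alpha/2}}{B\left(\frac{1}{2}, \frac{\alpha}{2}\right)} \left[\alpha + L(g - m)^2\right]^{-\frac{\alpha+1}{2}}, \tag{9}$$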



where $\alpha$ is the degrees of freedom parameter (or tail exponent), $m = \langle g \rangle$ is
the location parameter, $L$ is the scale parameter, and $B(a, b)$ is the Beta function, that
is, $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$ with $\Gamma(\cdot)$ being the gamma function. By the
definition of $g_i$, we have $\langle g_i \rangle = 0$ and thus $m = \langle g \rangle = 0$. Nonlinear
least-squares regression gives $\alpha = 3.1$ and $L = 1.9$. The fitted curve is drawn on the
left panel.
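
As a rough numerical check of the Student model, the sketch below evaluates the density at the reported parameters. It assumes the normalization $\sqrt{L}\,\alpha^{\alpha/2}/B(1/2, \alpha/2)$, which is the prefactor that makes the density integrate to one; the helper names and the simple trapezoidal integrator are illustrative choices and not part of the original analysis.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function via log-gamma functions."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def student_density(g, alpha, L, m=0.0):
    """Student density with tail exponent alpha, scale L and location m."""
    log_norm = (0.5 * alpha * math.log(alpha) + 0.5 * math.log(L)
                - log_beta(0.5, 0.5 * alpha))
    log_kernel = -0.5 * (alpha + 1.0) * math.log(alpha + L * (g - m) ** 2)
    return math.exp(log_norm + log_kernel)

def integrate(f, lo, hi, n=200000):
    """Composite trapezoidal rule on n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

alpha, L = 3.1, 1.9  # fitted values reported in the text

# The density should integrate to (nearly) one over a wide interval.
print(integrate(lambda g: student_density(g, alpha, L), -500.0, 500.0))

# Far in the tail, doubling g should scale the density by about 2**-(alpha + 1).
print(student_density(2000.0, alpha, L) / student_density(1000.0, alpha, L))
```

Working with logarithms of the normalization and the kernel avoids overflow for large arguments, which matters when the density is evaluated far into the tails.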



Fig. 1. (Color online) Empirical distribution of event-time returns with $\Delta t = 1$
transaction. Left: Empirical probability density function $f(g)$ of the event-time returns
$g$ aggregating the 23-stock data. The solid line is the Student density with $\alpha = 3.1$,
$m = 0$ and $L = 1.9$. Right: Empirical cumulative distributions $P(g)$ for positive and
negative normalized returns $g$. The solid lines are the least-squares fits of power laws
to the data with $\alpha^+ = 3.14 \pm 0.02$ for the positive tail and $\alpha^- = 3.00 \pm 0.02$
for the negative tail.

According to the left panel of Fig. 1, the Student density fits the tails of the
empirical density $f(g)$ nicely. The fitted model deviates remarkably from the empirical
density for small values of $|g|$. For large values of $|g|$, the Student density function
$f(g)$ approaches power-law decay in the tails:

$$f(g) \sim \begin{cases} (-g)^{-(\alpha+1)} & \text{for } g < 0, \\ g^{-(\alpha+1)} & \text{for } g > 0. \end{cases} \tag{10}$$
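
The step from Eq. (9) to this tail behavior can be spelled out as a short worked derivation, taking $m = 0$ as found above. For $L g^2 \gg \alpha$,

$$\left[\alpha + L g^2\right]^{-\frac{\alpha+1}{2}} = \left(L g^2\right)^{-\frac{\alpha+1}{2}} \left[1 + \frac{\alpha}{L g^2}\right]^{-\frac{\alpha+1}{2}} \approx L^{-\frac{\alpha+1}{2}}\, |g|^{-(\alpha+1)},$$

so the density decays as $|g|^{-(\alpha+1)}$. Integrating this decay over the tail gives a cumulative distribution that behaves as

$$P(|g| > x) \sim \int_x^{\infty} u^{-(\alpha+1)}\, du = \frac{x^{-\alpha}}{\alpha}.$$

With $\alpha \approx 3$, the cumulative distribution therefore follows the inverse cubic law.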



The empirical cumulative distributions $P(g)$ of the event-time returns for positive $g$
and negative $g$ are illustrated in the right panel of Fig. 1. Both positive and negative
tails decay in power-law form with $\alpha^+ = 3.14 \pm 0.02$ and $\alpha^- = 3.00 \pm 0.02$, in line
with the tail exponent estimated from the Student model. We note that the positive
and negative tails are not asymmetric. These results indicate that the standardized
returns obey the inverse cubic law.
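
A least-squares power-law fit to an empirical cumulative distribution, of the kind described above, might be sketched as follows. This is only an illustration under stated assumptions: the rank-based estimate of the cumulative distribution, the fitting window, and the synthetic Pareto sample with exponent 3 are all hypothetical choices, and the text does not specify the exact procedure used for the reported exponents.

```python
import math
import random

def empirical_ccdf(samples):
    """Sorted values x_k with the rank-based estimate of P(X >= x_k)."""
    xs = sorted(samples)
    n = len(xs)
    return xs, [(n - k) / n for k in range(n)]

def fit_tail_exponent(samples, x_min, x_max):
    """Least-squares slope of ln P versus ln x inside [x_min, x_max].

    Returns alpha in P(X >= x) ~ x**(-alpha); samples must be positive.
    """
    xs, ps = empirical_ccdf(samples)
    pts = [(math.log(x), math.log(p))
           for x, p in zip(xs, ps) if x_min <= x <= x_max]
    n = len(pts)
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    return -s_uv / s_uu

def tail_exponents(g, x_min, x_max):
    """Fit the positive tail and the (sign-flipped) negative tail separately."""
    positive = [x for x in g if x > 0]
    negative = [-x for x in g if x < 0]
    return (fit_tail_exponent(positive, x_min, x_max),
            fit_tail_exponent(negative, x_min, x_max))

# Synthetic stand-in for standardized returns: symmetric Pareto tails, alpha = 3.
random.seed(0)
synthetic = [random.choice((-1.0, 1.0)) * random.paretovariate(3.0)
             for _ in range(100000)]
alpha_plus, alpha_minus = tail_exponents(synthetic, 1.5, 10.0)
print(alpha_plus, alpha_minus)
```

The fitting window plays the role of the scaling range: the estimate is only meaningful over the interval in which the log-log plot is approximately linear.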
3.2 Probability distributions of aggregated event-time returns



We now turn to the distributions of aggregated event-time returns, where
$\Delta t$ spans several trades. By varying the value of $\Delta t$, we are able to compare the
PDFs at different time scales. Specifically, we compare the PDFs for $\Delta t = 2, 4, 8,
16, 32$ trades with that for $\Delta t = 1$ trade. The empirical $f(g)$ functions averaged over
23 stocks for different time scales $\Delta t$ are illustrated in Fig. 2(a), where the
distribution of one-trade returns is included for comparison. It is evident that the tail
becomes heavier as $\Delta t$ decreases. This phenomenon can also be characterized by the
kurtosis of the distributions. As listed in Table 1, the kurtosis of each PDF is significantly
greater than that of the Gaussian distribution, whose kurtosis is 3, indicating a much
slower decay in the tails. In addition, the kurtosis decreases as the scale $\Delta t$
increases. We also notice that the PDF for $\Delta t = 32$ trades decays more slowly than
an exponential.
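
The skewness and kurtosis used in this comparison can be computed from sample moments, as in the minimal sketch below. The kurtosis here is the non-excess version, so that a Gaussian gives 3, matching the convention in the text; the use of plain (biased) moment estimators is an assumption, since the estimator is not specified.

```python
from statistics import mean

def central_moment(data, k):
    """k-th central sample moment (divides by the sample size)."""
    mu = mean(data)
    return sum((x - mu) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Fourth central moment divided by the squared variance (Gaussian: 3)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Hypothetical sample with one large outlier: the outlier inflates the kurtosis.
sample = [0.0, 0.0, 0.0, 0.0, 10.0]
print(skewness(sample), kurtosis(sample))
```

Because the fourth moment weights large deviations heavily, a kurtosis well above 3 signals tails that decay more slowly than those of a Gaussian.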
Fig. 2. (Color online) Empirical distributions of aggregated event-time returns at different
time scales $\Delta t = 1, 2, 4, 8, 16, 32$ transactions. Panel (a): Empirical densities $f(g)$ of
the aggregated event-time returns. Panel (b): Empirical cumulative distributions $P(g)$ for
positive (upper cluster of curves) and negative (lower cluster of curves) returns $g$.



We have fitted the six curves using the Student density model (9), and the parameters
$L$ and $\alpha$ are listed in Table 1. In Fig. 2(b), we study the tail distributions of the
aggregated event-time returns $g$ for different time scales $\Delta t = 2, 4, 8, 16, 32$ trades.
It is observed that both positive and negative tails follow power-law distributions.
We have estimated the tail exponents, which are presented in Table 1. Note that the
scaling range decreases with increasing $\Delta t$, which is also observed for two Korean
indexes [43]. As expected, the tails decay faster for larger $\Delta t$, which is consistent
with the behavior of the kurtosis. In other words, the tail exponent increases with
$\Delta t$, which is validated by Table 1. It is interesting to note that the PDFs for large
$\Delta t$ deviate significantly from the inverse cubic law. We also find that
$\alpha^- < \alpha < \alpha^+$, which implies that the distributions are asymmetric and is in line
with the positive skewness. It seems that the sign of $\alpha^- - \alpha^+$ is not universal
across different stock markets [43].
Table 1. Characteristic parameters for aggregated event-time returns.

            Basic statistics      Student density   Positive tail                            Negative tail
$\Delta t$  Skewness  Kurtosis    $L$   $\alpha$    Scaling range          $\alpha^+$        Scaling range           $\alpha^-$
1           0.005     34.21       1.9   3.1         $2.4 \le g \le 60.3$   $3.14 \pm 0.02$   $2.1 \le -g \le 60.3$   $3.00 \pm 0.02$
2           0.026     26.99       1.9   3.2         $2.6 \le g \le 45.7$   $3.18 \pm 0.02$   $2.4 \le -g \le 45.7$   $3.02 \pm 0.03$
4           0.051     23.06       2.0   3.3         $2.6 \le g \le 33.9$   $3.33 \pm 0.03$   $2.4 \le -g \le 33.9$   $3.17 \pm 0.03$
8           0.106     19.53       1.9   3.5         $2.6 \le g \le 25.1$   $3.63 \pm 0.04$   $2.4 \le -g \le 28.2$   $3.39 \pm 0.04$
16          0.157     16.13       1.8   3.7         $2.6 \le g \le 15.8$   $3.87 \pm 0.05$   $2.4 \le -g \le 19.2$   $3.50 \pm 0.05$
32          0.106     13.62       1.9   4.0         $2.6 \le g \le 11.9$   $4.11 \pm 0.06$   $2.6 \le -g \le 13.1$   $3.96 \pm 0.07$



3.3 Probability distributions of clock-time returns 



In this section, we handle clock-time returns $r_i(t)$ defined over a fixed time interval
$\Delta t$. The normalized returns $g_i(t)$ are calculated similarly for each stock $i$. Five
different time intervals are selected, with $\Delta t = 1, 2, 3, 4, 5$ min. The mapping
from event time to clock time is nonlinear and is determined by the local trading
frequency. The trading frequency, as a measure of stock liquidity, changes from
time to time and from stock to stock. The average trading frequencies per minute
for individual stocks are estimated: 15.74, 4.81, 9.03, 2.08, 7.94, 2.13, 5.14, 2.03,
3.34, 1.22, 7.29, 6.14, 2.37, 1.67, 5.55, 7.05, 4.71, 3.68, 4.92, 2.34, 1.72, 2.79, 3.35.
This strongly nonlinear mapping implies that the distributions of returns in the two
categories may behave differently.
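
One possible way to map event-time midprices onto a fixed clock-time grid is previous-tick sampling, sketched below. This is an assumption made for illustration: the text does not state how prices are sampled at the interval boundaries, and the trade times and midprices in the example are hypothetical.

```python
import math

def clock_time_returns(trade_times, midprices, interval, t_start, t_end):
    """Log returns over consecutive clock-time windows of fixed length.

    trade_times must be sorted (in seconds); midprices[k] is the midprice
    right after the trade at trade_times[k]. At each grid point the most
    recent midprice is used, so a window without trades gives a zero return.
    """
    returns = []
    k = 0
    last = None      # most recent midprice seen so far
    previous = None  # midprice sampled at the previous grid point
    n_steps = int((t_end - t_start) // interval)
    for j in range(n_steps + 1):
        t = t_start + j * interval
        # Consume every trade that happened up to this grid point.
        while k < len(trade_times) and trade_times[k] <= t:
            last = midprices[k]
            k += 1
        if last is None:
            continue  # no trade has occurred yet
        if previous is not None:
            returns.append(math.log(last / previous))
        previous = last
    return returns

# Hypothetical trades: two in the first minute, one in each of the next two.
times = [5.0, 30.0, 70.0, 130.0]
mids = [10.0, 10.1, 10.2, 10.0]
print(clock_time_returns(times, mids, interval=60.0, t_start=0.0, t_end=180.0))
```

In this sketch each window absorbs however many trades fall inside it, which is the sense in which the number of events per clock-time interval varies with the local trading frequency.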

Figure 3(a) shows the empirical probability density functions of the returns for
$\Delta t = 1, 2, 3, 4, 5$ minutes. We observe a nice scaling: the five density functions
collapse onto a single curve when $|g|$ is less than 10, which is in agreement with
the fact that the kurtosis listed in Table 2 remains approximately unchanged. This
scaling is not surprising. Comparing the kurtosis listed in Table 2 with those in
Table 1, it seems that the five groups of clock-time returns are comparable to
the aggregated event-time returns with $\Delta t = 8$ to 16 trades. Nevertheless, the tail
distributions differ from each other. We have fitted the five curves to the Student
model (9) and presented the parameters $L$ and $\alpha$ in Table 2. We find that $\alpha$
increases with $\Delta t$, as expected. Note that these distributions decay faster than the
inverse cubic law.






Fig. 3. (Color online) Empirical distributions of clock-time returns at different time scales
$\Delta t = 1, 2, 3, 4, 5$ minutes. Panel (a): Empirical densities $f(g)$ of the clock-time returns.
Panel (b): Empirical cumulative distributions $P(g)$ for positive (upper cluster of curves)
and negative (lower cluster of curves) normalized returns $g$.



We investigate the behavior of the tail distributions of the clock-time returns $g$ for
different time scales $\Delta t = 1, 2, 3, 4, 5$ min in Fig. 3(b). We find several features
similar to those of the aggregated event-time returns, possibly because the number
of trades in an interval increases as the time interval becomes larger. Both the
positive and negative tail exponents are estimated and listed in Table 2. Again, the
inequality $\alpha^- < \alpha < \alpha^+$ holds roughly.



4 Conclusion 

Return is among the most important variables in the study of financial markets, and
its distribution has crucial implications for asset pricing and risk management. In this
work, we have investigated a database of ultra-high-frequency data
extracted from the limit-order books of 23 stocks traded on the Shenzhen Stock
Exchange during the whole year of 2003. We have studied two types of returns,
based on event time and clock time, respectively. We find that the distributions
of returns at different microscopic timescales ($\Delta t = 1, 2, 4, 8, 16, 32$ trades for
event-time returns and $\Delta t = 1, 2, 3, 4, 5$ minutes for clock-time returns) show
power-law tails. All the distributions at different timescales can be well modeled
by Student distributions with different tail exponents. For both types of returns, the
tail exponent increases with the timescale $\Delta t$, and the exponent for the positive tail
is greater than that for the negative tail at a fixed timescale. The inverse cubic law
is observed only for the one-trade event-time returns.